ABSTRACT

Situation  assessment  is  the  first  step  in  the  Command  and 
Control  process.  In  naval  tactical  teams,  it  has  become  more 
critical  even  as  it  has  become  more  difficult. 

Part  of  the  Navy's  attempt  to  address  this  issue  is  the 
Tactical  Decision  Making  Under  Stress  (TADMUS)  program.  Under 
TADMUS,  the  Situation  Assessment  In  Naval  Teams  (SAINT) 
experiment  was  run  at  NPS  in  December,  1991.  This  thesis 
describes  the  SAINT  experiment  and  uses  data  collected  during 
the  experiment  to  study  the  effects  of  team  leader  feedback  on 
situation  assessment  in  distributed  air  defense  teams.  The 
emphasis of study is on performance (error rate and pattern),
subjective workload, and communication rates.

Findings  include:  feedback  of  the  leader's  current 
assessment  lowers  explicit  coordination;  feedback  does  not 
affect  subjective  workload;  feedback  increases  error  rates, 
and  may  affect  error  patterns.  Evidence  of  feedback  causing 
confirmatory  bias  was  also  found,  but  more  research  in  this 
area  is  recommended. 
I.    INTRODUCTION

A.   BACKGROUND 

In  the  past  five  years,  the  U.S.  Navy  has  seen  a  profound  shift  in  the  threat  it  must 
meet.  The  probability  of  a  full-scale  conventional  or  nuclear  global  war  with  a 
monolithic,  centrally  controlled  superpower  has  vanished  along  with  the  Soviet  Union  and 
the Warsaw Pact. Unfortunately, the luxuries of a single, longtime foe (e.g., detailed
planning, well known tactics, a developed warning system, even a certain predictability
of threat) have also vanished. Containment of communism has been replaced with
maintenance  of  global  stability. 

Today's  Navy  is  faced  with  the  challenge  of  a  growing  number  of  nations  which 
possess  sophisticated  weaponry,  including  weapons  of  mass  destruction.  Increasing 
emphasis  is  placed  on  regional  conflict  scenarios.  Such  conflicts  are  typically  not  blue 
water  engagements  with  inherent  warnings  and  space  to  maneuver.  They  are  typically 
near land. A different set of threats to the surface operator is presented in this near
land operating area (NLOA): anti-ship missiles launched from shore or from highly
maneuverable patrol craft; shallow water mines; shore-based enemy aircraft; and a
360-degree threat sector. The results are less room for maneuver and far shorter reaction
times  to  a  wider  spectrum  of  threats. 

In  response,  the  Navy  is  developing  a  "stability  strategy"  which  focuses  on  two 
regional contingencies: preventing conflict where it can, and engaging in combat only
when it must [Ref. 1:p. 3]. The success of both depends on correctly assessing the
current tactical situation. Effective command, control, and communications (C3) begins with
effective situation assessment. Unfortunately, "The uncertainty of the period makes
warning signs even more ambiguous, reaction times even shorter, the identity and
motives of potential adversaries more vague and the timing and scenario of unfolding
events more difficult to discern." [Ref. 1:p. 2] In short, situation assessment in naval
teams  has  become  more  critical  even  as  it  has  become  more  difficult. 

In the area of anti-air warfare (AAW), the problem is especially acute. Detect-to-engage
sequences have been reduced to minutes; in some cases even to seconds. Yet
Combat  Information  Center  (CIC)  AAW  teams  are  still  trained  to  fight  the  traditional, 
blue  water  engagement  with  the  bulk  of  the  fighting  taking  place  in  an  outer  air  battle 
(OAB)  100  to  250  nautical  miles  from  the  main  force.  This  is  done  under  the 
assumption  that  doctrine  designed  for  a  blue  water  engagement  in  a  full  scale,  declared 
war  is  also  good  for  a  near-land  CALOW  (Crisis  and  Limited  Objective  Warfare) 
operation.    This  is  a  dangerous  assumption  and  one  that  is  under  critical  re-evaluation. 

Part  of  this  re-evaluation  effort  is  the  Navy's  Tactical  Decision  Making  Under 
Stress  (TADMUS)  program.  The  purpose  of  TADMUS  is  to  provide  a  better 
understanding  of  individual  and  team  behavior  in  distributed  naval  decision  making 
environments  under  high  stress  conditions  in  order  to  support  the  development  of  new 
training procedures and non-intrusive decision aids [Ref. 2]. Under the TADMUS
initiative,  ALPHATECH  INC.  has  developed  an  experiment  to  study  situation  assessment 
in  naval  teams  (SAINT)  [Ref.  3:p.  15]. 


The  first  SAINT  experiment  was  run  at  the  Naval  Postgraduate  School  in 
December,  1991.  This  thesis  will  describe  the  SAINT  experiment  and  will  use  data 
collected  during  the  experiment  to  study  the  effects  of  team  leader  feedback  on  situation 
assessment  in  distributed  air  defense  teams. 

B.      OBJECTIVE 

The  objective  of  this  thesis  is  to  identify  actions  or  behaviors  that  contribute  to 
performance under conditions of high stress. The emphasis was placed on the leader's
feedback to subordinate decision makers concerning his opinion of the hostility of a given
contact.  This  seemed  the  most  likely  area  in  which  changes  in  current  Navy  training 
structures  could  be  effected. 

1.     Research  Questions 

The  first  three  research  questions  do  not  relate  directly  to  the  thesis.  However, 
if  the  data  from  the  experiment  is  to  be  used,  these  questions  must  be  answered 
affirmatively. If the independent "stressor" variables have no effect on subjective
workload, they cannot be termed "stressors", and no statements can be made concerning
their relationships to dependent variables in the context of stress. The research questions
are  as  follows: 

•  Does  stress  due  to  time  pressure  increase  subjective  workload? 

•  Does stress due to uncertainty (garbled information) increase subjective
workload?

•  Does  stress  due  to  high  target  ambiguity  increase  subjective  workload? 

•  Does  leader  feedback  lower  communication  rates? 


•  Does  leader  feedback  lower  subjective  workload? 

•  Does  leader  feedback  lower  a  team's  overall  error  rate? 

•  Does  leader  feedback  affect  the  error  pattern,  (number  of  false  alarms  versus 
misses)? 

2.      Predictions 

Based  on  a  survey  of  the  literature,  an  attempt  was  made  to  predict  answers 
to  the  research  questions.  This  was  not  possible  in  all  instances. 

With  respect  to  time  pressure,  it  was  expected  that  subjective  workload  would 
increase  as  time  pressure  increased.  In  an  experimental  study  on  hierarchical  team 
coordination, Wang and Serfaty showed that this was the expected pattern [Ref. 4:p. 15].

It  was  expected  that  increasing  levels  of  uncertainty,  (induced  by  garbled 
information),  would  increase  the  subjective  workload  by  forcing  decision  makers  (DMs) 
to  probe  more  often  to  get  the  required  information  [Ref.  5].  The  stress  associated  with 
receiving  no  reward,  (information),  after  performing  the  correct  task,  (probe),  was  also 
expected  to  add  to  the  subjective  workload. 

The  stress  associated  with  high  target  ambiguity,  (difficulty  in  discrimination 
between hostile and neutral), was also expected to increase subjective workload. Entin
and  Serfaty  report  that  as  ambiguity  increases,  so  does  subjective  workload.  Their 
results  were  similar  under  both  high  and  low  time  pressure.    [Ref.  6:p.  46] 

Does  leader  feedback  lower  communication  rates?  It  is  known  that  as  time 
pressure  increases,  teams  adapt  to  the  increasing  subjective  workload  by  reducing  rates 
of explicit coordination [Ref. 5:p. 16]. It has also been hypothesized that the ability of
teams to coordinate implicitly is the result of shared mental models of both the task at
hand  and  the  capabilities  of  team  members  [Ref.  7:p.  1].  Furthermore,  expert 
commanders  "...communicate  their  intent  and  understanding  of  the  situation  frequently 
in  order  to  maintain  a  common  mental  model  of  the  situation,  an  essential  feature  to 
facilitate implicit coordination in the team." [Ref. 8:p. 8] It was therefore expected that
feedback  of  the  leader's  current  assessment  of  the  contact  would  lower  communication 
rates  by  facilitating  implicit  coordination. 

If,  as  expected,  feedback  lowers  explicit  coordination,  it  should  also  lower 
subjective  workload.  This  may  not  hold  true  in  the  instance  of  low  time  pressure,  but 
as  time  pressure  increases  and  the  need  for  implicit  coordination  rises  with  it,  feedback 
should  be  seen  as  a  factor  that  helps  maintain  workload  at  an  acceptable  level.  At  the 
very  least,  workload  should  be  less  under  high  time  pressure  with  feedback  than  under 
high  time  pressure  without  feedback.  This  should  also  hold  true  for  the  other  stressors, 
such  as  high  uncertainty  and  high  ambiguity. 

With  respect  to  the  last  two  research  questions,  there  appears  to  be  little 
empirical  research  that  has  studied  the  merits  of  feedback,  (as  narrowly  defined  in  the 
SAINT  experimental  paradigm),  on  team  performance.  However,  studies  have  shown 
that  the  assessment  of  a  situation  is  captive  to  the  most  recent  information  received  by 
the  decision  maker,  since  all  hypotheses  under  consideration  do  not  have  the  same  prior 
probability  of  occurring  [Ref.  9:p.  34].  This  recency  effect  is  compounded  by  the  fact 
that  people  have  cognitive  limitations  that  only  allow  them  to  maintain  a  few  hypotheses 
about a current situation at any given time [Ref. 9:pp. 34-35]. This is further
complicated in that people do not seek or apply information objectively in an effort to
confirm or refute the few hypotheses they do maintain. Rather, they frequently exhibit
"confirmatory biases". New information is sought and incoming information is filtered
to confirm rather than test a current assessment of the situation [Ref. 9:p. 44]. This can
have tragic results.

In the case of the USS Vincennes downing Iran Air Flight 655, the crew
appeared to exhibit a classic case of such confirmatory bias.

TIC  (Tactical  Information  Coordinator)  and  IDS  (Identification  Supervisor)  became 
convinced  track  4131  was  an  Iranian  F-14  after  receiving  the  IDS  report  of  a 
momentary  Mode-II.  After  this  report  of  the  Mode-II,  TIC  appears  to  have 
distorted  data  flow  in  an  unconscious  attempt  to  make  available  evidence  fit  a 
preconceived  scenario.  [Ref.  10:p.  45] 

Also, 

In  the  final  minute  and  forty  seconds,  the  AAW  (Anti-Air  Warfare  officer)  tells  his 
captain,  as  a  fact,  that  the  aircraft  has  veered  from  the  flight  path  into  an  attack 
profile,  and  is  rapidly  descending  at  increasing  speed  directly  towards  USS 
Vincennes.  Even  though  the  tone  of  these  reports  must  have  seemed  increasingly 
hysterical  ...  the  AAW  made  no  attempt  to  confirm  the  reports  on  his  own.  Quick 
reference  to  the  CRO  (character  read-out)  on  the  console  directly  in  front  of  him 
would  have  immediately  shown  increasing  not  decreasing  altitude.  .  .  .  (He)  relied 
on  the  judgement  of  one  or  two  second  class  petty  officers,  buttressed  by  his  own 
preconceived perception. [Ref. 11:p. 5]

The crew expected an air attack and all incoming information was construed
as confirming an earlier call by IDS of track 4131 as "Iranian F-14". Despite repeated
indications of an ascending contact squawking constant Mode-III, the AAW team
persisted in its assessment of a descending contact squawking Mode-II. This biased
interpretation of the available data was the only one transmitted to the captain, who
sought and considered only this interpreted assessment. He did not seek any raw
measurements of his own. [Ref. 10:pp. 1-45]

The  last  two  research  questions  may  give  some  insights  into  whether  feedback  of 
the  leader's  current  assessment  of  the  situation,  (i.e.,  hostile/neutral),  intensifies  or 
mitigates  the  phenomena  of  recency  and  confirmatory  bias. 


II.    EXPERIMENTAL  DESIGN

A.      OVERVIEW 

The  SAINT  experimental  paradigm  is  a  modification  of  the  Distributed  Dynamic 
Decisionmaking  (DDD-II)  paradigm  developed  by  Kleinman,  Serfaty  and  Luh  in  1984 
and  updated  by  Kleinman  and  Serfaty  in  1989.  For  the  SAINT  experiment  run  in 
December  of  1991,  the  task  was  a  CIC-type  distributed  situation  assessment  problem 
faced  by  a  four-person  hierarchical  command  team  [Ref.  3:p.  16].  The  primary  goal  of 
the  experimental  paradigm  was  for  the  team  to  collect,  evaluate  and  fuse  data  concerning 
an  inbound  contact  in  order  to  infer  correctly  its  hostility  or  neutrality  in  a  timely 
fashion.  The  simulated  environment  was  an  analogue  of  the  anti-air  warfare  (AAW) 
team  of  the  Combat  Information  Center  on  a  cruiser.  The  four-person  team  assessing 
the  contact  was  an  analogue  of  the  tactical  action  officer  (TAO)  and  three  of  his  support 
staff. 

Each  of  the  three  subordinate  team  members  performs  a  different  task. 
ALPHATECH  INC.'s  original  paradigm  for  SAINT,  as  set  forth  in  their  technical 
proposal  of  May,  1991,  called  for  a  team  structure  that  provided  for  partial  functional 
overlap  among  the  decision  makers.  Each  subordinate  decision  maker  was  to  have  the 
ability  to  probe  for  measurements  on  two  of  the  three  contact  attributes,  (size,  altitude 
rate  and  radar  emission  type),  with  primary  responsibility  in  one  and  secondary 
responsibility in the other [Ref. 3:p. 16]. This was to allow for the gathering of data
relating to how stress affects team coordination and burden sharing. However, the
eventual  team  structure  actually  used  in  the  December,  1991,  experiment  did  not  provide 
the  overlap  [Ref.  12].  Each  subordinate  had  access  to  only  one  of  the  contact's  three 
attributes.  No  horizontal  coordination  was  required  or  possible  in  completing  subordinate 
tasks.  Each  subordinate  team  member  obtained  noisy  measurements  on  one  of  the 
attributes  by  using  a  mouse  to  position  a  cursor  over  the  target  icon.  When  the  mouse 
was  clicked,  a  window  was  displayed  and,  after  ten  seconds,  a  measurement  of  the 
attribute was displayed. Occasionally, no information was provided, with a tick mark (-)
taking its place. The frequency of this information loss was manipulated as an
independent  variable  and  named  "uncertainty".  After  a  team  member  had  collected 
enough  readings  to  determine  an  attribute's  value,  this  value  was  passed  verbally  to  the 
TAO,  and  manually  entered  in  the  subordinate's  computer  log  along  with  the  subjective 
confidence  in  the  current  value  assigned  to  the  attribute. 
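
The probe mechanism described above can be illustrated with a minimal sketch. This is not the DDD-II simulator code: the function names, the Gaussian noise model, and the numeric attribute value are assumptions made purely for illustration. Only the tick mark for a lost reading and the manipulated frequency of loss are taken from the description of the experiment.

```python
import random

TICK = "-"  # symbol displayed when a probe returns no information


def probe(true_value, garble_rate, noise_sd=1.0, rng=random):
    """Return one noisy reading of an attribute, or TICK if the reading is lost.

    garble_rate is the probability that a probe yields no data (the
    "uncertainty" variable). The Gaussian noise model is an assumption.
    """
    if rng.random() < garble_rate:
        return TICK
    return true_value + rng.gauss(0.0, noise_sd)


def estimate(readings):
    """Average the usable readings; return None if every probe was garbled."""
    usable = [r for r in readings if r != TICK]
    return sum(usable) / len(usable) if usable else None


rng = random.Random(1991)  # fixed seed so the sketch is repeatable
readings = [probe(5.0, garble_rate=0.5, rng=rng) for _ in range(10)]
print(readings)
print("estimate:", estimate(readings))
```

With a high garble rate, a subordinate in this sketch must probe more often to accumulate the same number of usable readings, which is the mechanism by which uncertainty was expected to raise workload.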

The  job  of  the  TAO  was  to  fuse  the  attribute  information  provided  by  the 
subordinates  and  make  a  determination  as  to  the  contact's  hostility.  He  not  only  received 
verbal  reports  from  the  three  subordinates,  but  was  also  able,  with  his  mouse,  to  open 
a  window,  (see  Figure  1),  that  displayed  each  of  the  three  subordinates'  most  recent 
attribute values and confidence levels as entered in their personal logs. (Note: there was
a ten-second "communications delay" between the time a subordinate made an entry and
the time the TAO's version was updated.)

However,  the  TAO  did  not  have  direct  access  to  sensors.  He  had  to  verbally  task 
one or more of the subordinates to provide additional attribute estimates or raw data as
required. The TAO was to make a hostility determination based on current information.
The  dissemination  of  this  opinion  every  45  seconds  was  manipulated  as  an  independent 
variable  called  "feedback". 

The  TAO  had  to  make  a  final  determination  of  the  inbound  track's  hostility  before 
it entered the protected zone of the carrier. This final determination, or the contact
entering the protected zone, ended the trial, and all four team members received feedback
as to the correctness of their call. This feedback was not manipulated.

B.      SETUP 

1.  Physical 

The  physical  setup  of  the  experiment  consisted  of  four  physically  separate 
bays,  each  containing  a  single  game  station.  The  purpose  in  separating  the  stations  was 
to  ensure  that  all  communications  would  be  either  via  voice  net,  or  via 
the  software,  and  hence  recorded.  The  experiment  was  hosted  on  the  DDD-II  simulator 
using  software  developed  at  the  University  of  Connecticut  and  SUN  workstations 
connected  by  a  local  ETHERNET.  Each  game  station  consisted  of  a  graphics  display, 
a  keyboard,  a  mouse  and  an  intercom  headset  provided  by  NTSC  Orlando. 

2.  Test  Subjects 

The  test  subjects  included  nineteen  junior  to  field  grade  military  officers  and 
one  civilian.  The  twenty  subjects  were  drawn  from  the  Joint  Command,  Control, 
Communications  (JC3)  curriculum  at  the  Naval  Postgraduate  School  in  Monterey, 
California. The subjects were divided into five teams of four members. Operational
experience was considered both in selecting the teams and in assigning the TAOs. Team
cohesion was maintained throughout the experiment.

Figure 1: Sample SAINT Display

3.      Special  Equipment 

Special  equipment  included  a  VHS  recorder  and  the  intercom  headsets  with 
related  communications  equipment.  The  audio  signal  from  the  communications  net  was 
patched  directly  into  the  VHS  recorder.  The  TAO's  game  screen  and  all  verbal 
communications  were   recorded  in  this  manner. 

C.      HYPOTHESES 

The purpose of this thesis is to identify actions or behaviors that contribute to
CIC AAW team performance under conditions of high stress induced by time pressure,
uncertainty  and  ambiguity.  In  narrowing  the  emphasis,  leader  feedback  was  selected  as 
an  area  with  possible  implications  in  effective  team  training.  The  following  hypotheses 
are  based  on  the  research  questions  and  literature  survey  discussed  in  Chapter  I: 

1.  Hypothesis  I: 

Leader  feedback  of  current  hostility  assessment  lowers  explicit  coordination. 

2.  Hypothesis II:

Leader feedback of current hostility assessment lowers subjective workload.

3.  Hypothesis III:

Leader feedback of current hostility assessment lowers error rates.

4.  Hypothesis IV:

Leader  feedback  of  current  hostility  assessment  changes  the  error  pattern. 

D.      ASSUMPTIONS 

1.  General 

The  major  assumption  made  during  this  experiment  was  that  the  learning  curve  was 
completed  during  the  two  training  sessions,  and  that  the  data  is  therefore  free  from  any 
effects  due  to  the  learning  curve.  Another  assumption  was  that  the  subjects  were  willing 
and  enthusiastic,  and  that  the  data  is  therefore  not  tainted  by  halfhearted  guessing  on  the 
part  of  the  TAO  or  his  staff.  This  assumption  is  necessary  because  subjects  were  not 
volunteers. 

2.  Simplifying  Assumptions 

In  addition  to  the  general  assumptions  outlined  above,  there  were  several 
simplifying  assumptions  that  divorce  the  experimental  paradigm  from  reality  but  are 
necessary  to  gain  some  control  over  the  manipulation  of  the  selected  independent 
variables. These include:

•  Strictly an AAW problem;

•  No multi-target tracking;

•  Semi-artificial roles for subordinates;

•  Single intra-team communications net;

•  No inter-team (i.e., between platforms) communications;

•  Simulates only a small portion of total CIC personnel, activities, noise and
confusion.


E.      STATISTICAL  DESIGN 

The  experiment  was  designed  to  yield  balanced  data  for  ANOVA  purposes.  Four 
independent variables were part of this design. The first three resulted in 12 different
possible combinations (3 time stress levels x 2 uncertainty levels x 2 feedback
conditions). Each combination
was  presented  twice,  for  a  total  of  24  presentations  per  team.  The  four  levels  of 
ambiguity  were  manipulated  evenly  over  each  group  of  12  presentations.  The  24 
presentations  were  run  on  five  separate  teams,  yielding  120  data  points  for  performance 
measures.    The  four  independent  variables  are  outlined  below: 

1.  Time  Induced  Stress 

Stress  due  to  increasing  time  compression  was  manipulated  at  three  levels: 

•  Low  =  6  minute  prosecution  window; 

•  Medium  =  4  minute  prosecution  window; 

•  High  =  2  minute  prosecution  window. 

2.  Uncertainty 

As  discussed  earlier,  this  was  a  measure  of  how  often  a  probe  resulted  in  no 
data: 

•  Low  =  10%  garbled  data  (ticks); 

•  High = 50% garbled data (ticks).

3.  Feedback of Hostility Assessment

Under  "feedback"  conditions,  the  TAO  was  required  to  verbally  disseminate 
his  current  opinion  as  to  the  inbound  track's  hostility  or  neutrality  every  45  seconds. 
Under  "no  feedback"  conditions,  he  was  prohibited  from  ever  disseminating  his  opinion 
of  hostility  or  neutrality.  All  instances  of  improper  dissemination  were  recorded  by  the 
observer. 

4.  Ambiguity 

Ambiguity  was  manipulated  as  the  fourth  independent  variable.  This  was  a 
measure of how clearly hostile or clearly neutral the target profile was (see Figure 2).
One half of the profiles were ambiguous based on the general definition of "hostile" given
to  participants.  This  is  further  broken  down  as  follows:  one  fourth  clearly  neutral;  one 
fourth  ambiguous  neutral;  one  fourth  ambiguous  hostile;  and  one  fourth  clearly  hostile. 
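
The arithmetic of the balanced design in this section can be checked with a short sketch. The level values are taken from the text; the variable names and the rotation used to spread the four ambiguity levels over each group of 12 presentations are assumptions, since the actual assignment and presentation order are not given here.

```python
from itertools import product

TIME_WINDOWS_MIN = {"low": 6, "medium": 4, "high": 2}  # prosecution window, minutes
GARBLE_RATE = {"low": 0.10, "high": 0.50}              # fraction of probes returning ticks
FEEDBACK = (True, False)
AMBIGUITY = ("clearly neutral", "ambiguous neutral",
             "ambiguous hostile", "clearly hostile")
REPLICATIONS = 2
TEAMS = 5

# 3 x 2 x 2 = 12 combinations of the first three independent variables
cells = list(product(TIME_WINDOWS_MIN, GARBLE_RATE, FEEDBACK))

presentations = []
for rep in range(REPLICATIONS):
    for i, (time_stress, uncertainty, feedback) in enumerate(cells):
        # Assumed rotation: each ambiguity level appears three times per group of 12.
        ambiguity = AMBIGUITY[(i + rep) % len(AMBIGUITY)]
        presentations.append((time_stress, uncertainty, feedback, ambiguity))

print(len(cells), len(presentations), len(presentations) * TEAMS)  # 12 24 120
```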

F.       MEASURES 

1.      General 

The  experiment  included  both  qualitative  and  quantitative  measures. 
Qualitative  measures  included:  pre-experiment  questionnaires  to  measure  team 
preparation  and  coordination;  pre-presentation  predictions  of  hostility  recorded  for  all 
subjects;  subjective  workload  assessments  after  each  presentation;  subjective  performance 
evaluation questionnaires after each block of six presentations; and a post-experiment
questionnaire.

Figure 2: Target Trajectories

Quantitative measures were recorded automatically by the DDD-II simulator.
These  electronically  recorded  measures  number  over  forty  and  range  from  the  TAO's 
final  decision  to  the  number  of  times  each  subordinate  probed  the  contact  for  attribute 
information. The SWAT data (see next section) was also recorded by the simulator,
both for the four individual decision makers and the team. Quantitative measures of
verbal  communication  rates  and  types  were  recorded  manually  by  observers  with  tally 
sheets. 

2.      Workload  Assessment 

In  order  to  assess  workload  for  the  purpose  of  testing  the  viability  of  the 
stressors,  each  participant  completed  the  Subjective  Workload  Assessment  Technique  or 
SWAT  [Ref.  13:pp.  403-406].  SWAT  consists  of  two  phases.  Phase  one  should  be 
carried  out  prior  to  the  data  collection  part  of  the  experiment.  Each  participant  performs 
a  card  sort  to  develop  a  unique  workload  scale.  Each  card  contains  a  different 
combination of the three workload dimensions: time load; mental effort load; and
psychological stress load [Ref. 13]. Each dimension has three levels: low; moderate;
and high. Crossing the three dimensions with the three levels of each yields 27 possible
combinations, which are rank ordered by the participant according to the workload
described. It should be noted
here  that  this  phase  was  completed  after  data  collection  for  SAINT,  December,  1991. 

The  second  phase  occurred  during  data  collection.  At  the  end  of  each 
presentation,  subjects  rated  the  workload  they  had  just  experienced  based  on  the  same 
dimensions and levels described above (e.g., 321 would represent high time load,
moderate mental effort, and low psychological stress). Software developed by G. M.
Reid, ALPHATECH INC., then converts these numbers to a percent workload score
based  on  the  unique  workload  scale  developed  for  that  participant  in  phase  one.  Zero 
percent  represents  very  low  workload;  100  percent  represents  very  high  workload, 
(unique  to  the  individual). 
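
The two SWAT phases can be sketched as follows. This is an illustration only and not the ALPHATECH software: the function names are hypothetical, and the linear spacing of cards by rank is an assumption standing in for whatever scaling the actual conversion software applies to the card sort.

```python
from itertools import product

LEVELS = (1, 2, 3)  # 1 = low, 2 = moderate, 3 = high

# All 27 cards: (time load, mental effort load, psychological stress load)
CARDS = list(product(LEVELS, repeat=3))


def parse_rating(code):
    """Split a rating such as '321' into its three dimension levels."""
    time_load, effort, stress = (int(ch) for ch in code)
    return time_load, effort, stress


def rank_scale(card_sort):
    """Map each card to a 0-100 score by its position in the participant's sort.

    card_sort lists the 27 cards from lowest to highest perceived workload.
    Linear spacing by rank is an assumption, not the actual SWAT scaling.
    """
    last = len(card_sort) - 1
    return {card: 100.0 * i / last for i, card in enumerate(card_sort)}


# Hypothetical participant who weights time load most, then effort, then stress.
example_sort = sorted(CARDS)
scale = rank_scale(example_sort)

print(len(CARDS))
print(parse_rating("321"), scale[parse_rating("321")])
```

Because each participant produces a different card sort, the same rating code can map to different percent scores for different participants, which is what makes the scale unique to the individual.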

At  the  writing  of  this  thesis,  converted  SWAT  data  was  not  available.  Means 
from  phase  1  had  to  be  utilized.  These  means  are  typically  highly  correlated  with  the 
converted  SWAT  percentages.    [Ref.  14] 

3.      Subjective  Hostility  Assessment 

In  order  to  determine  the  subjective  definition  of  hostility  for  each  TAO,  so 
that  team  performance  measures  could  be  adjusted  accordingly,  TAOs  sorted  a  set  of 
hostility  cards  similar  to  the  SWAT  card  sort.  There  were  27  cards  reflecting  the  three 
target  attributes,  (size,  altitude  rate,  and  radar  emission),  and  the  three  levels  within  each 
attribute,  (small,  medium,  large;  climbing,  level,  descending;  no  emission,  search  radar, 
fire  control  radar).  In  this  manner,  each  TAO's  24  final  decisions  can  be  compared  to 
his  own  definition  of  hostility,  as  well  as  the  "ground  truth"  definition  of  the  paradigm. 
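
A comparison of this kind can be sketched as below. The trial data are made up for illustration, and the function names are hypothetical. The sketch assumes that a "false alarm" is a neutral contact called hostile and a "miss" is a hostile contact called neutral.

```python
from collections import Counter


def classify(decision, truth):
    """Label one final decision against a reference definition of hostility."""
    if decision == "hostile":
        return "hit" if truth == "hostile" else "false alarm"
    return "miss" if truth == "hostile" else "correct rejection"


def summarize(decisions, truths):
    """Return the error rate and the count of each outcome for a set of trials."""
    outcomes = Counter(classify(d, t) for d, t in zip(decisions, truths))
    errors = outcomes["false alarm"] + outcomes["miss"]
    return errors / len(decisions), outcomes


# Made-up data for six trials, for illustration only.
decisions    = ["hostile", "neutral", "hostile", "neutral", "hostile", "neutral"]
ground_truth = ["hostile", "neutral", "neutral", "hostile", "hostile", "neutral"]
tao_own_view = ["hostile", "neutral", "hostile", "hostile", "hostile", "neutral"]

for label, reference in (("ground truth", ground_truth),
                         ("TAO definition", tao_own_view)):
    rate, outcomes = summarize(decisions, reference)
    print(label, round(rate, 2), outcomes["false alarm"], outcomes["miss"])
```

Scoring the same decisions against both references separates errors of judgement from differences between the TAO's personal definition of hostility and that of the paradigm.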

The  design  of  the  December,  1991  SAINT  experiment  is  sound.  The 
assumptions  made  were  reasonable,  and  the  statistical  design  should  provide  balanced 
data.    A  more  detailed  description  of  the  actual  data  is  provided  in  the  next  chapter. 
III.   DATA  DESCRIPTION

A.  TYPES 

As  well  as  being  both  quantitative  and  qualitative,  the  data  was  collected  both 
manually  and  electronically.  Manually  collected  data  included  questionnaires,  card  sorts, 
and  observation  form  "tally  sheets",  as  well  as  the  audio/video  tape  of  each  presentation. 
Electronically  collected  data  included  the  quantitative  measures  collected  by  the 
computer. 

B.  PROBLEMS 

1.  Electronically  Collected  Data 

There  were  no  problems  with  the  electronically  collected  data.  Complete  data 
for  all  24  presentations  on  all  five  teams  were  collected.  In  addition,  complete  SWAT 
survey  data  was  collected  for  all  120  runs. 

2.  Manually  Collected  Data 

There  were  some  actual  as  well  as  some  potential  problems  in  the  manual 
collection  of  data.  The  video/audio  tape  was  not  started  at  the  beginning  of  all  runs. 
This  affected  three  of  120  runs.   However,  partial  runs  were  recorded  in  all  three  cases. 

Potential problems lay in the fact that the observation forms for
communications analysis can be interpreted differently by each observer (see Appendices
A and B). The categories are too broad, and much data may have been lost or skewed
because an observer could not force a comment to fit one of them, either letting it go or
placing it where it "best fit". Some effort was made to limit the inherent variance
from one observer to another. In the instance of TAO communications, the same
observer (the author) recorded all 24 presentations for all five teams. This was not
possible,  from  a  practical  standpoint,  in  the  case  of  subordinate  communications.  A  total 
of  seven  different  observers,  rotating  between  teams,  recorded  subordinate 
communications.  Additionally,  some  categories  are  simply  not  needed  based  on  the 
experimental  paradigm.  For  example,  information  transfers  of  raw  data  would  never  be 
made  by  the  TAO,  except  to  call  the  original  target  of  interest.  Indeed,  this  should  be 
a  category  of  its  own.  Careful  modifications  to  the  data  collection  forms  could  reduce 
the  confusion  for  observers  as  well  as  reduce  the  amount  of  potentially  lost  or  skewed 
data. 

C.      DATA  CODING  SCHEME 
1.     Manually  Collected  Data 

a.      Observation  Forms 

Appendix  C  contains  the  coding  scheme  for  data  collected  manually  with 
observation  forms.  Two  types  of  observation  forms  were  used:  one  for  the  TAO, 
(Appendix  A),  and  one  for  subordinates,  (Appendix  B).  These  forms  illustrate  the  areas 
of  interest  in  data  collection.  The  data  collection  method  simply  required  the  observers 
to  keep  a  tally  of  all  instances  of  communication  made  by  the  test  participants.  The 
forms  contain  separate  blocks  for  each  of  the  varied  types  of  communications  of  interest 
to the experimenters. For situations not specifically addressed by the form, a comments
section  is  provided.  The  problems  with  this  type  of  data  collection  were  discussed 
above. 

b.      Questionnaires 

Appendix  D  contains  the  data  coding  scheme  for  data  collected  manually 
using  questionnaires.  An  example  of  a  questionnaire  is  seen  in  Appendix  E.  This 
particular  questionnaire  was  given  to  participants  after  each  block  of  six  presentations  to 
solicit  opinions  and  assessments  concerning  mission  accomplishment,  team  and  individual 
performance  and  goal  achievement. 

2.      Electronically  Collected  Data 

Appendix  F  contains  the  coding  scheme  used  for  electronically  collected  data. 
Appendix  G  is  an  example  of  this  raw  data  as  extracted  from  the  computer  after  the 
experiment. 

Although  both  realized  and  potential  problems  occurred  in  data  collection,  a 
set  of  balanced  data  was  produced  for  purposes  of  analysis  of  variance.  A  description 
of  this  analysis  is  contained  in  the  next  chapter. 
IV.    ANALYSIS

A.      METHODOLOGY 

After all 120 trials had been run, the data were processed by ALPHATECH, INC. The
dependent variables were first organized into three sets, each with various categories.
Set #B1 includes semi-processed data collected on-line by the DDD-II simulator, and was
broken into performance, strategy, and workload categories. Set #B2 includes
semi-processed data collected by observers. It contains communications data on the TAO and
subordinate decision makers. Set #B3 includes semi-processed data collected from
subjects and contains data from the questionnaires.

The next step by ALPHATECH, INC. was to formulate a set of aggregated measures
with categorization, based on variable sets B1-B3. This set is designated #A1. The
coding scheme for aggregated measures set A1 is contained in Appendix H. The data
were then evaluated by subpopulation based on this coding scheme. Means tables were
generated for each subpopulation (dependent variable by independent variable, e.g., AP1
by feedback, AP1 by uncertainty, etc.). Analysis of variance (ANOVA) was performed
for all dependent variables.

ANOVA generates a "p" value. This value is the probability of observing a difference
at least as large as the one measured if the independent variable in fact had no effect.
It therefore bounds the risk of making an error in claiming that a given dependent
variable is affected differently by different levels of an independent variable. The
standard acceptable value is p <= 0.05. Informally, a value of p < 0.05 means that a
difference this large in dependent variable X would arise by chance alone, with no
contribution from independent variable Y, less than 5% of the time.
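
The F statistic that underlies an ANOVA p value can be sketched in a few lines. This is a minimal illustration and not the analysis run by ALPHATECH, INC.: the function name and the two groups of message rates are hypothetical, and only a single factor (feedback versus no feedback) is considered.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per level
    of the independent variable.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical message rates (messages per minute), NOT experimental data.
no_feedback = [7.0, 8.0, 7.5, 8.5]
feedback = [8.0, 9.0, 8.5, 9.5]

f_stat, df_b, df_w = one_way_anova_f([no_feedback, feedback])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p value is then the upper-tail probability of the F distribution with these degrees of freedom, which a statistics package would normally supply.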

B.      RESULTS 

This  thesis  will  only  provide  results  from  the  data  analysis  that  are  pertinent  to  the 
research  questions  stated  in  Chapter  I. 

1.  Workload  Assessment 

It was expected that uncertainty, time-stress and ambiguity would all increase
subjective workload. This is the case (see Figures 3-5); however, only uncertainty had
a statistically significant effect (p < 0.045). The mean workload for the entire population
was only 1.3459 (1 = low, 2 = moderate, 3 = high). Under the most stressful conditions
(high uncertainty, high time pressure, high ambiguity), workload was reported as 1.7709.
This is barely "moderate". Clearly, more stressors need to be introduced to the
experimental paradigm.

2.  Effects  of  Feedback 

a.      On  Communication  Rates 

It was expected that TAO feedback of his opinion as to the
hostility/neutrality of the inbound contact of interest would lower communication rates
by facilitating implicit coordination through an expanded shared mental model.
This does not, at first glance, appear to be the case. Feedback actually increased the
overall message rate (messages per minute) from 7.6667 without feedback to 8.4667
with feedback (p < 0.046). However, this is deceptive. By forcing the leader
to communicate every 45 seconds in the feedback condition, we artificially raise his
communication rate to subordinates from 1.4833 to 2.5167 (see Figure 6, p < 0.002).
If we look at subordinate communications to the TAO, a factor not artificially altered by
manipulating the independent variable, we see the mean percentage of communications
fall from 77.533 percent to 62.3333 percent under feedback (see Figure 7, p < 0.005).
Horizontal communications were not affected by feedback (p < 0.777), but this is
expected in that the experimental paradigm neither requires nor encourages
communication between subordinates. Feedback does not change this fact. On the
whole, feedback would seem to lower explicit coordination.

Figure 3: Workload Versus Uncertainty

Figure 4: Workload Versus Time-Pressure

Figure 5: Workload Versus Ambiguity

Figure 6: TAO To Subordinate Communications
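
The rate and direction measures discussed above can be sketched as follows. This is an illustration under stated assumptions: the function name and the message counts are hypothetical, the rate is taken as total messages divided by elapsed time in seconds and scaled to minutes, and the directional percentages are taken as shares of all messages sent.

```python
def communication_summary(down, up, horizontal, elapsed_secs):
    """Summarize one presentation's message traffic.

    down       -- messages from leader to subordinates
    up         -- messages from subordinates to leader
    horizontal -- messages from subordinate to subordinate
    """
    total = down + up + horizontal
    if total == 0 or elapsed_secs <= 0:
        raise ValueError("need at least one message and a positive duration")
    return {
        "msgs_per_min": total * 60 / elapsed_secs,
        "pct_down": 100 * down / total,
        "pct_up": 100 * up / total,
        "pct_horizontal": 100 * horizontal / total,
    }


# Hypothetical tallies for a single 90-second presentation.
summary = communication_summary(down=3, up=9, horizontal=0, elapsed_secs=90)
print(summary)
```

Separating the downward count from the upward count is what allows the artificial rise in leader traffic under feedback to be distinguished from the behavior of the subordinates.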

If we look at the mean number of information requests made by the
TAO, we see that they drop from a mean of 2.5167 per presentation with no feedback
to 2.0000 per presentation with feedback (p < 0.044). Another interesting measure is
the "anticipation ratio". This is the difference between information transfers to the TAO
and information requests from the TAO, divided by the information transfers to the TAO
and expressed as a percent. The anticipation ratio increases from 85.75% with no
feedback to 87.4333% with feedback (however, p < 0.325). Taken together, the drop
in information requests and the rise in the anticipation ratio seem to indicate that
feedback does play a role in implicit coordination.
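
The anticipation ratio defined above reduces to a one-line computation. The sketch below is illustrative only: the function name is hypothetical and the counts are invented, not taken from the experiment.

```python
def anticipation_ratio(transfers_to_tao, requests_from_tao):
    """Percent of information transfers to the TAO that were not
    prompted by a TAO request: (transfers - requests) / transfers."""
    if transfers_to_tao <= 0:
        raise ValueError("ratio is undefined without any transfers")
    return 100 * (transfers_to_tao - requests_from_tao) / transfers_to_tao


# Hypothetical counts for one presentation: 20 transfers, 3 requests.
print(anticipation_ratio(20, 3))
```

A higher value means subordinates are volunteering more of what the leader needs without being asked, which is the sense in which the measure reflects implicit coordination.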

b.      On  Subjective  Workload 

It was expected that feedback would lower subjective workload. This
was not the case. Workload under no feedback conditions was 1.3425 and actually
increased very slightly under feedback to 1.3493. These numbers are nearly
the same (standard deviation 0.4), and the p value is not significant.
Feedback also had no effect under the high time pressure condition, the condition
predicted to be most likely to show an effect of feedback.

c.  On  Error  Rate 

No predictions were made concerning the effects of feedback on error
rate. The mean error rate (according to ground truth in the paradigm) increased from
15.0 percent with no feedback to 28.33 percent with feedback (p < 0.078). When we
adjust the data for the TAOs' subjective hostility definition, the error rate jumps to 26.67
percent and is not affected by feedback (p < 1.0). Measured against ground truth,
feedback of TAO opinion as to contact hostility has a negative impact on team
performance, although the effect falls short of the p <= 0.05 criterion. This is seen
graphically in Figure 8.

d.  On  Error  Pattern 

No predictions were made concerning the effects of feedback on the
error pattern (false alarm rate versus miss rate). As seen earlier, the overall error rate
nearly doubles under feedback. Did the error pattern change as well? It would appear
not (see Figure 9). As expected, the false alarm rate and the miss rate are both affected
by feedback (p < 0.099). Under both conditions, the false alarm rate is larger than the
miss rate, and by nearly the same proportion. However, when examined closely, it is
seen that under the no feedback condition the miss rate is only 80 percent as large as
the false alarm rate. Under the feedback condition, the miss rate rose to 88 percent as
large as the false alarm rate. Feedback may have an effect on the error pattern that is
too subtle to see readily with the relatively small population of 120 final decisions.
Added evidence to this effect may be seen when we look at the TAO initial judgment.
Under the no feedback condition, this was 1.4167 (1 = neutral, 2 = hostile). Under
feedback, this number drops to 1.3167 (p < 0.109). Figure 10 shows this in raw
percentages. From this figure, it is clear that under no-feedback conditions, TAOs
initially report neutrals and hostiles at about the same rate. When TAOs provide
feedback, they report neutrals at a 2:1 ratio over hostiles. When compared to the error
patterns, we see that as the initial judgment gets closer to neutral, the miss rate
increases as a proportion of the false alarm rate. In other words, as feedback drives the
initial judgment towards neutral (from 1.4 to 1.3), TAOs are more likely to call a
hostile contact neutral at final decision than under the no feedback condition.
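
The error pattern measures used in this section can be illustrated with a short sketch. It assumes the coding 1 = neutral and 2 = hostile, and it assumes that each rate is expressed as a fraction of all final decisions, so that the error rate is the sum of the false alarm rate and the miss rate. The function name and the ten trials are hypothetical.

```python
NEUTRAL, HOSTILE = 1, 2


def error_pattern(trials):
    """Compute error measures from (ground_truth, final_decision) pairs."""
    n = len(trials)
    # False alarm: a neutral contact judged hostile.
    false_alarms = sum(1 for truth, decision in trials
                       if truth == NEUTRAL and decision == HOSTILE)
    # Miss: a hostile contact judged neutral.
    misses = sum(1 for truth, decision in trials
                 if truth == HOSTILE and decision == NEUTRAL)
    return {
        "error_rate": (false_alarms + misses) / n,
        "false_alarm_rate": false_alarms / n,
        "miss_rate": misses / n,
        # Miss rate as a proportion of the false alarm rate.
        "miss_to_false_alarm": misses / false_alarms if false_alarms else None,
    }


# Ten invented trials: two false alarms and one miss.
trials = [(1, 1), (1, 2), (1, 2), (2, 2), (2, 1),
          (2, 2), (1, 1), (2, 2), (1, 1), (2, 2)]
print(error_pattern(trials))
```

The last entry is the quantity compared across conditions above, where the miss rate is reported as 80 percent and 88 percent of the false alarm rate.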
TAOS' INITIAL JUDGMENTS

e.      The Question of Confirmatory Bias

As discussed in Chapter I, feedback may have some influence on
confirmatory bias. There is some evidence of this in the two preceding sections. It was
stated earlier that feedback increases the error rate. It also lowers the probe rate of the
contact by subordinates from 0.1815 probes per second to 0.1722 probes per second
(p < 0.037). Furthermore, feedback increases slack time (time remaining at final
decision) from 23.25 seconds with no feedback to 28.7333 seconds with feedback
(however, p < 0.312). Confidence on final judgment (1: low, 2: moderate, 3: high) also
increased, from 1.6167 with no feedback to 1.6333 with feedback (p < 0.034). When
combined, these factors seem to indicate a trend, under feedback, of a willingness to
make the wrong decisions more quickly, with less information, yet with increased
confidence. This seems to indicate that feedback of the leader's current situation
assessment as to the inbound contact's hostility contributes to confirmatory bias, which
in turn reduces overall performance. Further evidence is seen in the fact that
confirmatory bias, if it is caused by feedback, would predict the slight change in error
pattern caused by the change in initial hostility judgment as discussed in the preceding
section.
V.   CONCLUSIONS

The  purpose  of  this  chapter  is  to  draw  conclusions  about  the  four  hypotheses  of  the 
thesis.  They  will  be  made  based  on  the  data  analysis  discussed  above.  With  regard  to 
the  first  hypothesis,  it  is  concluded  that  feedback  does  lower  explicit  coordination.  There 
is  also  some  evidence,  though  not  statistically  significant  (p<.325),  that  feedback 
increases  anticipation.  It  is  not  clear  whether  or  not  the  cause  of  this  is  an  enhanced 
shared  mental  model.   Further  study  should  be  done  in  this  area. 

With  regard  to  the  second  hypothesis,  there  is  little  evidence  that  feedback,  in  the 
narrow  definition  of  the  experimental  paradigm,  lowers  subjective  workload.  This  should 
be  looked  at  when  the  converted  SWAT  percentages  are  available,  and  should  be  studied 
again  under  conditions  of  truly  high  stress. 

With regard to the third hypothesis, it cannot be concluded that leader feedback
lowers  error  rates.  Indeed,  there  is  strong  evidence  to  suggest  that  it  increases  error 
rates.  The  adjustment  of  the  data  for  subjective  hostility  definition  was  inconclusive;  all 
p  values  jumped  to  1.0.  This  indicates  a  problem  in  the  method  of  obtaining  this 
subjective  definition  that  should  be  addressed  prior  to  the  next  experiment.  It  is  touched 
on  briefly  in  the  next  section. 

With  regard  to  the  fourth  hypothesis,  there  is  some  evidence  that  feedback  may 
have  affected  the  error  pattern.  However,  the  evidence  is  not  strong,  and  further 
research  should  be  done  in  this  area. 
In addition to the four hypotheses, there is evidence that feedback contributes to
confirmatory bias and, as a result, lowers performance. On balance there seems to be
little to recommend feedback of this nature in situation assessment.




VI.    RECOMMENDATIONS 

A.      FOR  FUTURE  SAINT  EXPERIMENTS 

1.      Stressors 

While  the  SWAT  data  indicate  that  the  stressors  utilized  by  the  experimental 
paradigm  had  the  expected  effects  on  subjective  workload,  it  is  clear  that  situation 
assessment  under  truly  high  levels  of  stress  was  not  observed.  The  mean  subjective 
workload  was  only  1.3459,  (3=high  stress).  Under  the  most  stressful  conditions,  (high 
time  pressure,  high  uncertainty,  high  ambiguity),  the  mean  subjective  workload  was  only 
1.7709.   More  realism,  and  as  a  result  more  stress,  must  be  introduced. 

The first way to do this would be to add a secondary and even a tertiary task.
Keeping an externally located superior informed (manipulated by whether or not the
superior queries) is one possibility. Another is making appropriate warnings to the
unknown inbound aircraft (manipulated by a screen prompt). Another way to increase
stress would be to increase contact attributes. For example, have another decision maker
determine if the contact is in a designated commercial airway, and have another probe
for Identification Friend or Foe (IFF) readings. Adding attributes would not only
enlarge the TAO decision matrix, but would also increase the stress on the
communications circuit by adding more users. A final way to increase the stress on all
four players would be to eliminate the "highlight" on the target icon, making more than
one of the "clutter" tracks potential targets.
2.  Data Collection

As  discussed  in  the  section  devoted  to  data,  the  observation  forms  should  be 
revised  to  better  reflect  the  experimental  paradigm,  and  the  categories  should  be  more 
specific.  Additionally,  all  four  observers  should  have  headsets  so  they  can  hear  requests 
as  well  as  responses.  Much  data  may  have  been  lost  because  the  observers  could  only 
hear  one  side  of  communications.  For  example,  "Roger."  may  be  classified  as  no 
information  transfer,  a  transfer  of  raw  data  or  a  transfer  of  an  opinion  on  hostility 
depending  on  the  question  or  statement  which  prompted  the  response. 

3.  Subjective  Hostility 

The hostility card sort should be done before data collection, after 12
presentations, and after the last presentation. This is recommended because there is
evidence that subjective definitions of hostility changed throughout the experiment. This
would have been predicted by Kathryn Blackmond Laskey, who states in a study on
assessing preferences in the presence of random response error:

There  are  four  general  approaches....  The  first  is  simply  to  ignore  the  problem, 
treating  the  decision  maker's  responses  as  if  they  were  error  free.  The  second  is 
to  average  multiple  judgments  concerning  the  value  or  utility  of  each  outcome.  If 
the response errors are independent of one another, this averaging strategy will
produce  more  reliable  preference  assessments.  The  third  approach... is  to  employ 
consistency  checks  by  including  logically  interdependent  judgments  in  the 
preference  assessment  task.  If  inconsistencies  arise,  the  decision  maker  is  asked 
to  resolve  them.  In  the  process,  decision  analysts  argue,  the  decision  maker  will 
gain  insight  into  his  own  preferences,  and  discover  his  true  preferences.  The 
fourth  approach  to  the  problem  of  response  error  is  to  fit  preference  models  to  the 
decision  maker's   responses.    [Ref.  15:p.  996] 

For the December experiment, it seems Laskey's first approach was used. The
card sort should be done in the second recommended manner in order to capture any
trends, so that the performance data can be adjusted accordingly. When the sort is done
only once, as in the December 1991 experiment, the adjusted error rates increase instead
of decrease, and all associated p values jump to 1.0. This is because we are applying
a single subjective definition of hostility to all 24 presentations, when that definition
probably changed more than once.
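
The averaging strategy named in the quotation above can be sketched as follows. This is an illustration under assumptions that are not part of the experiment as run: each card sort is represented as a mapping from a contact profile to a hostility rating in percent, and the profile names, the ratings and the function name are all hypothetical.

```python
from statistics import mean


def average_card_sorts(sorts):
    """Average each profile's hostility rating over repeated card sorts."""
    profiles = sorts[0].keys()
    return {p: mean(sort[p] for sort in sorts) for p in profiles}


# Three invented sorts, taken at the three recommended points.
sorts = [
    {"profile_A": 80, "profile_B": 30},  # before data collection
    {"profile_A": 70, "profile_B": 40},  # after 12 presentations
    {"profile_A": 60, "profile_B": 50},  # after the last presentation
]
print(average_card_sorts(sorts))
```

Keeping the three sorts separate, and not only their average, would also expose any drift in the subjective definition over the course of the session.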

4.      For  Further  Study 

As  discussed  earlier,  under  the  no  feedback  condition,  TAOs  initially  reported 
neutrals  and  hostiles  at  the  same  rate.  However,  under  feedback  conditions,  they 
reported  neutrals  at  a  2:1  ratio  over  hostiles.  Are  TAOs  anticipating  a  subordinate  bias 
towards  hostile  and  unconsciously  trying  to  adjust  for  this  "framing"?  Further  study  is 
needed  to  answer  this  question. 

B.      FOR NAVAL TEAM TRAINING

The  results  of  this  first  SAINT  experiment  would  seem  to  indicate  that  Navy  team 
trainers  should  discourage  the  feedback  of  the  CO/TAO  opinion  of  an  inbound  contact's 
hostility  or  neutrality  while  the  situation  is  still  being  assessed.  While  feedback  may  help 
in  facilitating  implicit  coordination  by  extending  the  shared  mental  model,  feedback  of 
this  specific  nature  extends  the  model  too  far.  The  goal  of  the  CIC  AAW  team  is  to 
assess  an  often  confusing,  uncertain  situation.  By  giving  feedback  on  his  current 
assessment,  the  leader  compounds  the  negative  phenomenon  of  confirmatory  bias,  and 
the error rate increases. Sharing what are essentially predictions of the final decision
biases the team and reduces performance.
There is no room for bias in today's uncertain world where correct and timely
situation assessment can mean the prevention of the loss of innocent life, and the
avoidance of inaction leading to tragedy and disgrace. Indeed, it can mean the difference
between war and the avoidance of war, and is therefore a critical aspect of command and
control.
APPENDIX A

"SAINT" EXPERIMENT: NPS Nov - Dec 91
OBSERVATION FORM FOR TAO

Team #: 1 2 3 4 5 6        Trial #:        Date/Time:        Observer:

TAO TO:

Type                                Subordinate 1   Subordinate 2   Subordinate 3   All

Information
  1. Hostile/Friendly
  2. Judgment and Confidence
  3. Raw Data

Information Requests
  1. Hostile/Friendly
  2. Judgment and Confidence
  3. Raw Data

Others
  1. Team Bolstering
  2. Actions Requests

Failures
  Gave Feedback when shouldn't
  Didn't Give Feedback when Should

Time of 1st Hostility Judgment:

Additional Notes on this Trial:
"SAINT" EXPERIMENT: NPS Nov - Dec 91
OBSERVATION FORM FOR SUBORDINATES

Team #: 1 2 3 4 5 6        Trial #:        Date/Time:        Observer:

Type                                Subordinate to Subordinate   Subordinate to TAO   Subordinate to Team

Information
  1. Hostile/Friendly
  2. Judgment and Confidence
  3. Raw Data

Information Requests
  1. Hostile/Friendly
  2. Judgment and Confidence
  3. Raw Data

Others
  1. Team Bolstering
  2. Actions Requests

Additional Notes on this Trial:
SAINT EXPERIMENT (TADMUS J3299)

Dependent Variables

Set # B2: semi-processed data collected on-line by observers

I. COMMUNICATION:
I.1 Leader (TAO)

INFORMATION COMMUNICATION

C1. Total number of information transfers from leader to subordinates (L/S)
C2. Number of L/S opinion (hostile/neutral judgment & confidence) transfers
C3. Number of L/S processed info. (specialized judgment & confidence) transfers
C4. Number of L/S raw data transfers

C5. Total number of information requests from leader to subordinates (L/S)
C6. Number of L/S opinion (hostile/neutral judgment & confidence) requests
C7. Number of L/S processed info. (specialized judgment & confidence) requests
C8. Number of L/S raw data requests

OTHER COMMUNICATION

C9. Total number of feedback errors
C10. Number of times leader gave feedback when shouldn't have
C11. Number of times leader didn't give feedback when should have
C12. Number of bolstering comments to subordinates
C13. Number of action requests (other than above) by leader to subordinates

TIMELINESS*

C14. Latency of leader's first judgment [secs]

* Although not a communication measure, this latency/delay measure was recorded by the TAO's observer


ALPHATECH,  INC. 


I.2 Subordinates

INFORMATION COMMUNICATION

C15. Total number of information transfers among subordinates (S/S)
C16. Number of S/S opinion (hostile/neutral judgment & confidence) transfers
C17. Number of S/S processed info. (specialized judgment & confidence) transfers
C18. Number of S/S raw data transfers

C19. Total number of information requests among subordinates (S/S)
C20. Number of S/S opinion (hostile/neutral judgment & confidence) requests
C21. Number of S/S processed info. (specialized judgment & confidence) requests
C22. Number of S/S raw data requests

C23. Total number of information transfers from subordinates to leader (S/L)
C24. Number of S/L opinion (hostile/neutral judgment & confidence) transfers
C25. Number of S/L processed info. (specialized judgment & confidence) transfers
C26. Number of S/L raw data transfers

C27. Total number of information requests from subordinates to leader (S/L)
C28. Number of S/L opinion (hostile/neutral judgment & confidence) requests
C29. Number of S/L processed info. (specialized judgment & confidence) requests
C30. Number of S/L raw data requests

OTHER COMMUNICATION

C31. Number of bolstering comments among subordinates
C32. Number of action requests (other than above) among subordinates
SAINT EXPERIMENT (TADMUS J3299)

Dependent Variables

Set # B3: semi-processed data collected off-line from subjects

I. SUBJECTIVE RATINGS I: After each experimental block

R1. Subjective rating (1-5) of team's coordination activities (4 blocks over time).

R2. Subjective rating (1-5) of team's radio-net discipline (4 blocks over time).

R3. Subjective estimate (1-5) of the amount of information obtained from other
team members to perform job (4 blocks over time).

R4. Subjective estimate (1-5) of the number of measurements (probes) taken per
trial (4 blocks over time).

R5. Subjective estimate (1-5) of the number of times communications occurred
with other team members (4 blocks over time).

R6. Subjective estimate (1-5) of the amount of time spent communicating with other
team members (4 blocks over time).

II. SUBJECTIVE RATINGS II: Post-experiment
TO BE CATEGORIZED
"SAINT" EXPERIMENT: NPS Nov - Dec 91
POST SESSION QUESTIONNAIRE

Team #: 1 2 3 4 5 6        Trial #:        Date/Time:

Please place an X anywhere on the line that best reflects your response.

1. On average, how difficult were the scenario trials you just completed?

   |---------|---------|---------|---------|---------|
   Not Difficult At All        Midpoint        Very Difficult

2. How well did you perform your specific task?

   |---------|---------|---------|---------|---------|
   Not Well At All             Midpoint        Very Well

3. How would you rate your team's coordination activities while performing target identification?

   |---------|---------|---------|---------|---------|
   Not Well At All             Midpoint        Very Well

4. How well did you exercise radio-net discipline?

   |---------|---------|---------|---------|---------|
   Not Well At All             Midpoint        Very Well

5. How well did the team exercise radio-net discipline?

   |---------|---------|---------|---------|---------|
   Not Well At All             Midpoint        Very Well

6. In addition to sensor measurements, how much information did you get from other team
members to perform your job?

   |---------|---------|---------|---------|---------|
   Very Little                 Midpoint        A Great Deal

7. On average, how many measurements did you take per trial (for the scenario trials just
completed)?

   |---------|---------|---------|---------|---------|
   0         2         4         6         8         10 or more

8. On average, how many times did you communicate with another team member (for the scenario
trials just completed)?

   |---------|---------|---------|---------|---------|
   0         3         6         9         12        15 or more

9. On average, how much time did you spend communicating with other team members (for the
scenario trials just completed)?

   |---------|---------|---------|---------|---------|
   Very Little                 Midpoint        A Great Deal


SAINT EXPERIMENT (TADMUS J3299)

Dependent Variables

Set # B1: semi-processed data collected on-line by DDD simulator

I. PERFORMANCE:

ACCURACY

P1. Final decision by Leader (TAO) (1: neutral, 2: hostile)
P2. Final error (1: correct, 0: incorrect, -1: no decision)
P3. Final confidence (1: low, 2: moderate, 3: high)

TIMELINESS

P4. Time remaining at final decision [secs]

II. STRATEGY:

II.1: Leader (TAO)

INFORMATION INPUT/OUTPUT

S1. Number of judgment entries by leader (without final decision)
S2. Number of database queries by leader (to see subordinate's latest judgment)

DECISIONMAKING

S3. Number of judgment changes by leader over time (e.g., 1121 -> 2 changes)
S4. Initial leader's judgment (1: neutral, 2: hostile)
S5. Initial leader's confidence (1: low, 2: moderate, 3: high)

II.3: Subordinates (IDS, TIC, EWS)

INFORMATION SEEKING

S6. Total number of probes by subordinates (information seeking activity)
S7. Information seeking rate: S6 / (Tinitial - Tfinal) [probes/min]
S8. Number of probes by DM1 (IDS)
S9. Number of probes by DM2 (TIC)
S10. Number of probes by DM3 (EWS)

INFORMATION RECORDING

S11. Number of database entries by DM1
S12. Number of database entries by DM2
S13. Number of database entries by DM3
S14. Total number of database entries by subordinates

INFORMATION PROCESSING

S15. Initial judgment by DM1 (1: small, 2: mid-size, 3: large)
S16. Initial judgment by DM2 (1: climbing, 2: leveling-off, 3: descending)
S17. Initial judgment by DM3 (1: no emission, 2: search, 3: fire control)
S18. Initial confidence of DM1 (1: low, 2: moderate, 3: high)
S19. Initial confidence of DM2 (1: low, 2: moderate, 3: high)
S20. Initial confidence of DM3 (1: low, 2: moderate, 3: high)
S21. Final judgment by DM1 (1: small, 2: mid-size, 3: large)
S22. Final judgment by DM2 (1: climbing, 2: leveling-off, 3: descending)
S23. Final judgment by DM3 (1: no emission, 2: search, 3: fire control)
S24. Final confidence of DM1 (1: low, 2: moderate, 3: high)
S25. Final confidence of DM2 (1: low, 2: moderate, 3: high)
S26. Final confidence of DM3 (1: low, 2: moderate, 3: high)


III. WORKLOAD:

INDIVIDUAL RAW SWAT SCORES (T: Time pressure, E: Mental Effort, S: Stress)

W1. DM1's T score (1: low, 2: medium, 3: high)
W2. DM1's E score (1: low, 2: medium, 3: high)
W3. DM1's S score (1: low, 2: medium, 3: high)
W4. DM2's T score (1: low, 2: medium, 3: high)
W5. DM2's E score (1: low, 2: medium, 3: high)
W6. DM2's S score (1: low, 2: medium, 3: high)
W7. DM3's T score (1: low, 2: medium, 3: high)
W8. DM3's E score (1: low, 2: medium, 3: high)
W9. DM3's S score (1: low, 2: medium, 3: high)
W10. Leader's T score (1: low, 2: medium, 3: high)
W11. Leader's E score (1: low, 2: medium, 3: high)
W12. Leader's S score (1: low, 2: medium, 3: high)

TEAM CONSENSUS RAW SWAT SCORES

W13. Team's T score (1: low, 2: medium, 3: high)
W14. Team's E score (1: low, 2: medium, 3: high)
W15. Team's S score (1: low, 2: medium, 3: high)
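
The judgment-change measure in this set (the example given is the sequence 1121, which contains 2 changes) can be illustrated with a short sketch. It assumes the leader's successive judgment entries are available as a list of codes in time order; the function name is hypothetical and is not part of the DDD simulator.

```python
def count_judgment_changes(judgments):
    """Count how many times consecutive judgment entries differ."""
    return sum(1 for prev, cur in zip(judgments, judgments[1:]) if prev != cur)


# The sequence 1, 1, 2, 1 (1: neutral, 2: hostile) changes twice.
print(count_judgment_changes([1, 1, 2, 1]))
```

Repeated entries of the same judgment do not count as changes, so a leader who re-enters an unchanged assessment every 45 seconds would still score zero.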
SAINT EXPERIMENT (TADMUS J3299)   NPS NOV-DEC 1991

EXPERIMENT DATA SET #1

SUBJECTS

Team ID [1,2,3,4,5]

INDEPENDENT VARIABLES

Structure [1: no feedback; 2: TAO's feedback]

Uncertainty [1: low (10% jamming); 2: high (50% jamming)]

Time pressure [1: low (6 min.); 2: moderate (4 min.); 3: high (2 min.)]

SCENARIO ATTRIBUTES

Target hostility [1: neutral; 2: hostile]
Target ambiguity [1,2,3: low; 4,5,6: high]
SAINT EXPERIMENT (TADMUS J3299)

Dependent Variables

Set # A1: Aggregated measures with categorization
(based on variable sets B1, B2, and B3)

I. PERFORMANCE:

ACCURACY

AP1. TAO's final judgment (1: neutral, 2: hostile)
AP2. Confidence on final judgment (1: low, 2: moderate, 3: high)
AP3. Final composite target hostility judgment [0-100%]
AP4. Error rate (according to ground truth)
AP5. False alarm rate (false positive)
AP6. Miss rate (false negative)

TIMELINESS

AP7. Latency of first hostile/neutral judgment [secs]
AP8. Team explicit information processing time [secs]
AP9. Slack time (time remaining at final decision) [secs]

SUBJECTIVE PERFORMANCE

AP10. Error rate (according to TAO's prior subjective hostility ratings)
AP11. False alarm rate (according to TAO's prior subjective hostility ratings)
AP12. Miss rate (according to TAO's prior subjective hostility ratings)
AP13. Discrepancy factor in composite target hostility judgment (AP3 - HR)
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II.  STRATEGY: 


n.l:    Leader  (TAO) 

INFORMATION  INPUT/OUTPUT 

AS  1 .  Number  of  database  queries  by  leader  (to  see  subordinate's  latest  judgment) 

DECISIONMAKING 

AS2.  Leader's  initial  judgment  (1:  neutral,  2:  hostile) 

AS3.  Leader's  initial  confidence  (1:  low,  2:  moderate,  3:  high) 

AS4.  Leader's  initial  composite  target  hostility  judgment  [0-100%] 

AS5.  Number  of  judgment  changes  by  leader  over  time  (e.g..  1 1£1  — >  2  changes) 

AS6.  Change  in  leader's  confidence  over  time 
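
As an illustration of AS5, the sketch below counts how often the leader's recorded judgment differs from the one before it. It assumes the judgments are available as a time-ordered list of 1/2 codes; the function name is hypothetical.

```python
def count_judgment_changes(judgments):
    """Count transitions between consecutive, differing judgments.

    `judgments` is assumed to be a time-ordered list of codes
    (1: neutral, 2: hostile).
    """
    return sum(1 for prev, cur in zip(judgments, judgments[1:]) if prev != cur)
```

A leader who goes from neutral to hostile and back to neutral, `[1, 2, 1]`, has made two changes; a leader who never revises the initial judgment has made none.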

II.2: Subordinates (IDS, TIC, EWS)

INFORMATION  SEEKING 

AS7.  Total  number  of  probes  by  subordinates  (information  seeking  activity) 

AS8. Information seeking rate: AS7 / AP8 [probes/sec]

AS9.  Information  seeking  unbalance  among  subordinates  (std.  dev.  /  mean) 
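
The two derived information-seeking measures can be sketched as follows. The rate is the total probe count (AS7) divided by the team processing time in seconds (AP8). The unbalance is the standard deviation of the per-subordinate counts divided by their mean. The listing does not say whether the population or sample standard deviation was used; the population form is assumed here, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def information_seeking_rate(total_probes, processing_time_secs):
    """AS7 / AP8: probes per second of team processing time."""
    if processing_time_secs <= 0:
        raise ValueError("processing time must be positive")
    return total_probes / processing_time_secs


def unbalance(per_member_values):
    """Standard deviation divided by mean (coefficient of variation).

    Assumption: population standard deviation (pstdev). Returns None
    when the mean is zero, where the ratio is undefined.
    """
    m = mean(per_member_values)
    if m == 0:
        return None
    return pstdev(per_member_values) / m
```

With probe counts of 2, 4, and 6 for the three subordinates, the mean is 4 and the unbalance is about 0.41; equal counts give 0. The same std. dev. / mean ratio would apply to any other per-subordinate measure.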

INFORMATION  RECORDING 

AS10. Total number of database entries by subordinates

INFORMATION  PROCESSING 

AS11. Final judgment by DM1 (1: small, 2: mid-size, 3: large)

AS12. Final judgment by DM2 (1: climbing, 2: leveling-off, 3: descending)

AS13.  Final  judgment  by  DM3  (1:  no  emission,  2:  search  ,  3:  fire  control) 

AS14. Final confidence of DM1 (1: low, 2: moderate, 3: high)

AS15. Final confidence of DM2 (1: low, 2: moderate, 3: high)

AS16. Final confidence of DM3 (1: low, 2: moderate, 3: high)

AS17. Final composite hostility judgment by DM1 [0-100%]

AS18. Final composite hostility judgment by DM2 [0-100%]

AS19. Final composite hostility judgment by DM3 [0-100%]

AS20.  Final  average  composite  hostility  judgment  by  subordinates  [0-100%] 
III. COORDINATION:

COMMUNICATION-GLOBAL: 

AC1.  Total  number  of  messages  sent 

AC2. Communication rate (AC1 / AP8) * 60 [msgs/min]

COMMUNICATION-DIRECTION: 

AC3. Messages from leader to subordinates (down)
AC4. Messages from subordinates to leader (up)
AC5. Messages from subordinate to subordinate (horizontal)
AC6. Communication rate (AC3 / AP8) * 60 [msgs/min]
AC7.  Total  number  of  broadcast  messages  sent  by  leader 
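
Both communication rates in this section use the same conversion: a message count (AC1 for the global rate, AC3 for the directional rate) divided by the team processing time AP8 in seconds, then multiplied by 60 to give messages per minute. A minimal sketch, with a hypothetical function name:

```python
def messages_per_minute(message_count, processing_time_secs):
    """Convert a message count over AP8 seconds to messages per minute.

    Algebraically the same as (message_count / processing_time_secs) * 60.
    """
    if processing_time_secs <= 0:
        raise ValueError("processing time must be positive")
    return message_count * 60 / processing_time_secs
```

For instance, 24 messages exchanged during 180 seconds of processing time corresponds to 8 messages per minute.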

COMMUNICATION-INFORMATION 

AC8.  Percentage  of  information  transfers 
AC9.  Percentage  of  information  requests 
AC10. Percentage of non-informational communications (team bolstering, etc.)

COMMUNICATION-INFORMATION  FLOW 

AC11. Total number of information requests by leader

AC12.  Total  number  of  information  transfers  by  leader 

AC13. Total number of information transfers by subordinates to leader

AC14. Anticipation ratio (= (AC13 - AC11) / AC13) [0-100%]
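
The anticipation ratio expresses the share of subordinate-to-leader information transfers that the leader did not have to request. The sketch below applies the formula as listed and reports the result as a percentage. The function name is hypothetical, and returning None when no transfers occurred is an assumption, since the listing does not say how that case was handled.

```python
def anticipation_ratio(transfers_to_leader, requests_by_leader):
    """(AC13 - AC11) / AC13, expressed as a percentage.

    transfers_to_leader -> AC13
    requests_by_leader  -> AC11
    Returns None when AC13 is zero (assumed handling; the ratio is undefined).
    """
    if transfers_to_leader == 0:
        return None
    return (transfers_to_leader - requests_by_leader) / transfers_to_leader * 100
```

With 20 transfers to the leader and 5 leader requests, the ratio is 75%: three quarters of the transfers arrived without being asked for.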

COMMUNICATION-INFORMATION  GRANULARITY  AND  DIRECTION 
AC15. Total number of processed information messages
AC16. Number of processed information messages sent by leader
AC17. Number of processed information messages sent by subordinates
AC18. Total number of raw data messages
AC19. Total number of raw data messages sent by leader
AC20. Total number of raw data messages sent by subordinates
IV. WORKLOAD:

INDIVIDUAL 

AW1. Normalized, calibrated SWAT score for DM1 [0-100%]
AW2. Normalized, calibrated SWAT score for DM2 [0-100%]
AW3. Normalized, calibrated SWAT score for DM3 [0-100%]
AW4. Normalized, calibrated SWAT score for leader [0-100%]

TEAM 

AW5. Normalized, calibrated average SWAT score for subordinates [0-100%]

AW6. Normalized, calibrated SWAT score for team (group scale) [0-100%]

AW7.  Workload  unbalance  among  subordinates  (std.  dev.  /  mean) 
IV. SUBJECTIVE RATINGS:

IV.1 By leader (TAO)

PERFORMANCE 
AR1. 

COORDINATION 

TEAMWORK 

IV.2  By  subordinates 
PERFORMANCE 
COORDINATION 
TEAMWORK 